1 Scripts

The code used for analysis is located on GitHub and compiled together here.

2 Quality control

2.1 Calling nuclei barcodes

  • First, we used the cell barcodes called by Cell Ranger ARC (algorithm described here). This step called 14,301 barcodes out of 2,215,183.
  • Second, because the Cell Ranger ARC cell-calling algorithm is very permissive toward barcodes with very low counts (i.e., it requires a minimum of only a single count in each library), barcodes were additionally filtered with a minimum count threshold in both the ATAC and RNA libraries. The thresholds were chosen from the clearly defined population of cells in the scatterplot of RNA versus ATAC counts. Additionally, barcodes with more than 5% of RNA reads mapping to genes on the mitochondrial genome were excluded. A total of 886 barcodes were filtered out, leaving 13,145.
  • Last, multiplets were filtered out using two independent methods that rely on either the ATAC or the RNA library to call multiplets. AMULET relies on the assumption that in snATAC-seq of a diploid genome there should be at most two overlapping fragments with the same cell barcode at any locus; more than two overlapping fragments potentially indicates a multiplet. Doublets were also identified from the RNA-seq libraries with DoubletFinder. An additional 1,103 multiplet barcodes were filtered out, for a final number of 12,042 high-quality nuclei barcodes.
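
The filtering logic in the steps above can be illustrated with a minimal Python sketch. It is not the code used for this analysis (that is in the linked scripts), and several things in it are assumptions: the `BarcodeMetrics` record, the function names, and the two count thresholds are hypothetical placeholders, since the actual cutoffs were chosen from the scatterplot and are not stated here. Only the 5% mitochondrial cutoff comes from the text, and the sketch applies it to counts as a stand-in for reads. The overlap function only illustrates the diploid assumption that AMULET builds on; it does not reproduce AMULET's statistical test, and DoubletFinder is not sketched at all.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the report chooses its cutoffs from the
# RNA/ATAC scatterplot and does not state them, so these are placeholders.
MIN_RNA_COUNTS = 1000
MIN_ATAC_COUNTS = 1000
MAX_MITO_FRACTION = 0.05  # "more than 5%" is excluded, as stated in the text


@dataclass
class BarcodeMetrics:
    """Per-barcode summary (hypothetical record layout)."""
    barcode: str
    rna_counts: int
    atac_counts: int
    mito_rna_counts: int


def passes_count_filters(m: BarcodeMetrics) -> bool:
    """Keep a barcode only if both libraries clear their count threshold
    and at most 5% of its RNA counts are mitochondrial."""
    if m.rna_counts == 0:
        return False  # avoids dividing by zero below
    if m.rna_counts < MIN_RNA_COUNTS or m.atac_counts < MIN_ATAC_COUNTS:
        return False
    return m.mito_rna_counts / m.rna_counts <= MAX_MITO_FRACTION


def max_fragment_overlap(fragments: list[tuple[int, int]]) -> int:
    """Largest number of fragments covering any single position.

    Fragments are half-open (start, end) intervals on one chromosome,
    all carrying the same cell barcode.
    """
    events = []
    for start, end in fragments:
        events.append((start, 1))   # fragment opens
        events.append((end, -1))    # fragment closes
    # Tuple sorting places (pos, -1) before (pos, 1), so a fragment that
    # ends at a position is closed before one starting there is opened.
    events.sort()
    depth = best = 0
    for _, delta in events:
        depth += delta
        best = max(best, depth)
    return best


def exceeds_diploid_overlap(fragments: list[tuple[int, int]]) -> bool:
    """True if more than two fragments overlap somewhere, which under the
    diploid assumption is a potential sign of a multiplet."""
    return max_fragment_overlap(fragments) > 2
```

For example, a barcode with fragments `(0, 10)`, `(5, 15)` and `(8, 20)` has three fragments covering positions 8 and 9, so `exceeds_diploid_overlap` flags it, whereas two adjacent fragments `(0, 10)` and `(10, 20)` never overlap and are not flagged.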

2.2 Knee plot

2.3 Scatter plot of counts

Cells after all filtering

Cell Ranger ARC only

And low count threshold

2.4 Violin QC plots

After filtering

Before filtering

2.5 Multiplets

Mapped onto UMAP projection

Counts

Count histograms

Multiplets

After filtering

3 Seurat

3.1 Clusters

Cells per cluster

Acclimation clusters

Together

Split

3.2 Features

3.2.1 Gene expression